Multi-modal summarization model based on semantic relevance analysis
Yuxiang LIN, Yunbing WU, Aiying YIN, Xiangwen LIAO
Journal of Computer Applications    2024, 44 (1): 65-72.   DOI: 10.11772/j.issn.1001-9081.2022101527

Multi-modal abstractive summarization is commonly built on the Sequence-to-Sequence (Seq2Seq) framework, whose objective function optimizes the model at the character level: it searches for locally optimal results when generating words and ignores the global semantic information of the summary samples. This can cause semantic deviation between the summary and the multi-modal information, resulting in factual errors. To solve these problems, a multi-modal summarization model based on semantic relevance analysis was proposed. Firstly, a summary generator based on the Seq2Seq framework was trained to generate semantically diverse candidate summaries. Secondly, a summary evaluator based on semantic relevance analysis was applied to learn, from a global perspective, both the semantic differences among candidate summaries and the evaluation mode of ROUGE (Recall-Oriented Understudy for Gisting Evaluation), so that the model could be optimized at the level of summary samples. Finally, the summary evaluator was used to carry out reference-free evaluation of the candidate summaries, so that the finally selected summary was as similar as possible to the source text in semantic space. Experiments on the benchmark dataset MMSS show that the proposed model improves ROUGE-1, ROUGE-2 and ROUGE-L by 3.17, 1.21 and 2.24 percentage points respectively compared with the current best-performing MPMSE (Multimodal Pointer-generator via Multimodal Selective Encoding) model.
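The final selection step, reference-free scoring of candidate summaries against the source, can be illustrated with a minimal sketch. This is not the authors' evaluator: the abstract describes a learned model over multi-modal input, whereas the sketch below is text-only and swaps in a bag-of-words vector with cosine similarity as a stand-in for the learned semantic space. The names `embed`, `cosine` and `select_summary` are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned encoder (assumption): a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_summary(source, candidates):
    """Score each candidate against the source only (no reference summary)
    and return the highest-scoring candidate with its score."""
    src_vec = embed(source)
    scored = [(cosine(src_vec, embed(c)), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


source = "a dog runs across the green park"
candidates = [
    "a cat sleeps on the sofa",
    "a dog runs in the park",
    "stock markets fell sharply",
]
best, score = select_summary(source, candidates)
```

In this toy setting the second candidate shares the most words with the source and is selected. A learned evaluator would replace `embed` with an encoder trained to separate candidates by semantic relevance, and would also take the image modality into account.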
